Self Attention
Self Attention
Self-attention lets every element of the input sequence attend to every other element, so the model can capture long-range dependencies between positions in the sequence.
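A minimal sketch of this idea, assuming scaled dot-product attention with hypothetical projection matrices `Wq`, `Wk`, `Wv` (names and dimensions are illustrative, not from the original notes):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (N, d) input sequence; Wq/Wk/Wv project to queries, keys, values
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # every position scores every other
    A = softmax(scores, axis=-1)             # attention weights; each row sums to 1
    return A @ V                             # output: weighted sum of values

rng = np.random.default_rng(0)
N, d = 5, 8
X = rng.standard_normal((N, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8): one output vector per input position
```

Because every query is compared against every key, even the first and last tokens interact directly, which is how the long-range dependency above is captured.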
Input and Output
- Input: Sequence of vectors (e.g., word embeddings, one-hot encodings)
- Output:
    - N -> model -> N (e.g., POS tagging: one label per input token)
    - N -> model -> 1 (e.g., classification: one label for the whole sequence)
    - N -> model -> N' (e.g., translation), also called Sequence-to-Sequence (Seq2Seq)
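The first two output patterns can be sketched with a hypothetical per-token representation matrix `H` (all names and shapes here are illustrative assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d, num_classes = 4, 8, 3
H = rng.standard_normal((N, d))              # per-token representations from the model
W_out = rng.standard_normal((d, num_classes))

# N -> N (e.g., POS tagging): one prediction per input token
per_token_logits = H @ W_out                 # shape (N, num_classes)
print(per_token_logits.shape)                # (4, 3)

# N -> 1 (e.g., classification): pool the sequence into a single vector first
pooled = H.mean(axis=0)                      # shape (d,)
sequence_logits = pooled @ W_out             # shape (num_classes,)
print(sequence_logits.shape)                 # (3,)
```

The N -> N' case needs a decoder that generates outputs step by step, so it is not shown here.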
Note
The information below is for the N -> model -> N' (Seq2Seq) case.
Relevant
How relevant two input vectors a_i and a_j are to each other is measured by a score α. Two common modules for computing it:
- Dot-Product: project both inputs (q = W^q a_i, k = W^k a_j) and take their inner product, α = q · k. This is the variant used in the Transformer.
- Additive: add the two projections, pass the sum through tanh, then project to a scalar with a learned vector: α = w^T tanh(q + k).